2 Hidden Layers

Programming a neural network with 2 hidden layers is essentially the same process as with a single hidden layer, with one extra step. Because there is an extra layer, there is a third weight array and one more application of the chain rule in the gradient descent step. Thus there is a bit more math and a few extra lines of code.
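As a rough sketch of what that looks like in the forward direction, here is a minimal two-hidden-layer forward pass in plain numpy. The names and layer sizes are illustrative assumptions chosen to match the equations below, not the interface of NeuralNetImport:

import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

# Illustrative shapes: 64 inputs, hidden layers of 60 and 50, 10 outputs.
w1 = np.random.rand(60, 64)   # input -> hidden layer 1
w2 = np.random.rand(50, 60)   # hidden layer 1 -> hidden layer 2
w3 = np.random.rand(10, 50)   # hidden layer 2 -> output

def forward(K):
    L = np.dot(w1, K)            # L = K x w1
    M = np.dot(w2, sigmoid(L))   # M = sigma(L) x w2
    N = np.dot(w3, sigmoid(M))   # N = sigma(M) x w3
    return sigmoid(N)            # y_hat = sigma(N)

y_hat = forward(np.random.rand(64))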


In [17]:
import NeuralNetImport as NN
import numpy as np
from sklearn.datasets import load_digits 
digits = load_digits()
import NNpix as npx
from IPython.display import HTML
from IPython.display import display

Neuron with 2 Hidden Layers


In [18]:
npx.cneuron2


Out[18]:
[image: diagram of a neuron in a network with 2 hidden layers]
Gradient Descent with 2 Hidden Layers


In [19]:
npx.derivation2


Out[19]:
[image: gradient descent derivation for 2 hidden layers]
In [20]:
f = open("HTML2.html")

In [21]:
display(HTML(f.read()))


| Diagram | Equations | Partial Derivatives |
| --- | --- | --- |
| $\hat{y}$ | $\hat{y}=\sigma(N)$ | $\frac{\partial\hat{y}}{\partial N} = \sigma'(N)$ |
| $N$ | $N = \sigma(M) \times w_3$ | $\frac{\partial N}{\partial M} = \sigma'(M) \times w_3$, $\frac{\partial N}{\partial w_3} = \sigma(M)$ |
| $M$ | $M = \sigma(L) \times w_2$ | $\frac{\partial M}{\partial L} = \sigma'(L) \times w_2$, $\frac{\partial M}{\partial w_2} = \sigma(L)$ |
| $L$ | $L = K \times w_1$ | $\frac{\partial L}{\partial w_1} = K$ |

| Weight | Gradient with Chain Rule | Gradient with Substitution |
| --- | --- | --- |
| $w_3$ | $\frac{\partial{C}}{\partial{w_3}} = -(y-\hat{y}) \frac{\partial{\hat{y}}}{\partial{N}} \frac{\partial{N}}{\partial{w_3}}$ | $\frac{\partial{C}}{\partial{w_3}} = -(y-\hat{y}) \times \sigma'(N) \times \sigma(M)$ |
| $w_2$ | $\frac{\partial{C}}{\partial{w_2}} = -(y-\hat{y}) \frac{\partial{\hat{y}}}{\partial{N}} \frac{\partial{N}}{\partial{M}} \frac{\partial{M}}{\partial{w_2}}$ | $\frac{\partial{C}}{\partial{w_2}} = -(y-\hat{y}) \times \sigma'(N) \times \sigma'(M) \times w_3 \times \sigma(L)$ |
| $w_1$ | $\frac{\partial{C}}{\partial{w_1}} = -(y-\hat{y}) \frac{\partial{\hat{y}}}{\partial{N}} \frac{\partial{N}}{\partial{M}} \frac{\partial{M}}{\partial{L}} \frac{\partial{L}}{\partial{w_1}}$ | $\frac{\partial{C}}{\partial{w_1}} = -(y-\hat{y}) \times \sigma'(N) \times \sigma'(M) \times w_3 \times \sigma'(L) \times w_2 \times K$ |

In [22]:
f.close()
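The substituted forms in the right column of the table translate almost line for line into code. A minimal scalar sketch of the three gradients (one neuron per layer, made-up values; the real implementation works on whole arrays):

import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def sigmoid_prime(t):
    s = sigmoid(t)
    return s * (1 - s)

# Illustrative scalar values for one input/solution pair.
K, y = 0.5, 1.0
w1, w2, w3 = 0.1, 0.2, 0.3

# Forward pass, matching the diagram equations.
L = K * w1
M = sigmoid(L) * w2
N = sigmoid(M) * w3
y_hat = sigmoid(N)

# Gradients, matching the substituted forms in the table.
dC_dw3 = -(y - y_hat) * sigmoid_prime(N) * sigmoid(M)
dC_dw2 = -(y - y_hat) * sigmoid_prime(N) * sigmoid_prime(M) * w3 * sigmoid(L)
dC_dw1 = (-(y - y_hat) * sigmoid_prime(N) * sigmoid_prime(M) * w3
          * sigmoid_prime(L) * w2 * K)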

Create Training Inputs and Solutions

Use 1000 random samples to generate the training input and solution. The other 792 will be used to test.


In [23]:
perm = np.random.permutation(1792)  # shuffled indices into the digits data
training_input = np.array([digits.images[perm[i]].flatten() for i in range(1000)])/100  # flatten each 8x8 image and scale the pixel values down

In [24]:
training_solution = NN.create_training_soln(digits.target[perm], 10)
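NN.create_training_soln is presumably a one-hot encoder: each digit label becomes a length-10 vector with a 1 in the position of the correct digit. A minimal sketch of that idea (the real function's output layout may differ):

import numpy as np

def one_hot(targets, n_classes):
    """Turn an array of integer labels into one-hot rows."""
    soln = np.zeros((len(targets), n_classes))
    soln[np.arange(len(targets)), targets] = 1
    return soln

# e.g. one_hot(np.array([3, 0]), 10)[0] -> [0,0,0,1,0,0,0,0,0,0]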

In [25]:
train = NN.NN_training_2(training_input, training_solution, 64, 10, 60, 50, 80, 0.7)
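Reading the arguments against the assertions further down, the call appears to specify 64 input neurons (one per pixel of an 8×8 image), 10 output neurons (one per digit), and hidden layers of 60 and 50 neurons. The remaining 80 and 0.7 are presumably the iteration count and the learning rate; that is an inference from context, not documentation.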

Getting Weights

To train the network and generate the weights yourself, uncomment the line below. The cells after it load previously saved weights instead.


In [33]:
# x,y,z = train.train()

In [34]:
f = np.load("2HiddenWeights.npz")

In [35]:
x = f['arr_0']  # weights: input -> hidden layer 1
y = f['arr_1']  # weights: hidden layer 1 -> hidden layer 2
z = f['arr_2']  # weights: hidden layer 2 -> output

In [36]:
assert len(x) == 60
assert len(y) == 50
assert len(z) == 10

Find Solutions


In [37]:
ask = [NN.NN_ask_2(np.array([digits.images[perm[i]].flatten()])/100,x,y,z) for i in range(1000,1792)]

In [38]:
comp_vals = [ask[i].get_ans() for i in range(len(ask))]
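get_ans presumably returns the index of the strongest output neuron, i.e. the predicted digit, so comp_vals can be compared directly against digits.target below.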

Calculate Accuracy


In [39]:
test_solutions = np.array([digits.target[perm[i]] for i in range(1000, 1792)])
print(np.mean(np.array(comp_vals) == test_solutions) * 100, "%")


97.7272727273 %
